vignettes/geospatial_visualization.Rmd
geospatial_visualization.RmdBoth the coronavirus and covid19_vaccine datasets provide country-level information on the covid19 cases and vaccination progress, respectively. One common approach for communicating country-level data is with the use of choropleth maps. The focus of this vignette is to demonstrate how to merge the dataset with the geometric metadata and plot it. We will use the following packages:
rnaturalearth - Provides geo-spatial metadata from the Natural Earth dataset. More details available here
sf - A package that provides simple features access for R
mapview - A wrapper for the leaflet library
tmap - A package for creating a thematic maps
ggplot2 - Is a system for declaratively creating graphics
viridis - A package that provide a series of color maps
Note: This vignette is not available on the CRAN version (due to size limitation). Therefore, as the packages above are not on the dependencies list of the coronavirus package, you may need to install them before.
Let’s get start by loading the data:
library(coronavirus) data("covid19_vaccine") head(covid19_vaccine) #> country_region date doses_admin people_partially_vaccinated #> 1 Afghanistan 2021-02-22 0 0 #> 2 Afghanistan 2021-02-23 0 0 #> 3 Afghanistan 2021-02-24 0 0 #> 4 Afghanistan 2021-02-25 0 0 #> 5 Afghanistan 2021-02-26 0 0 #> 6 Afghanistan 2021-02-27 0 0 #> people_fully_vaccinated report_date_string uid province_state iso2 iso3 code3 #> 1 0 2021-02-22 4 <NA> AF AFG 4 #> 2 0 2021-02-23 4 <NA> AF AFG 4 #> 3 0 2021-02-24 4 <NA> AF AFG 4 #> 4 0 2021-02-25 4 <NA> AF AFG 4 #> 5 0 2021-02-26 4 <NA> AF AFG 4 #> 6 0 2021-02-27 4 <NA> AF AFG 4 #> fips lat long combined_key population continent_name continent_code #> 1 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS #> 2 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS #> 3 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS #> 4 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS #> 5 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS #> 6 <NA> 33.93911 67.70995 Afghanistan 38928341 Asia AS
We will use the ne_countries function from the rnaturalearth package to pull the country geometric data:
library(dplyr) map <- ne_countries(returnclass = "sf") %>% dplyr::select(name, iso2 = iso_a2, iso3 = iso_a3, geometry) head(map) #> Simple feature collection with 6 features and 3 fields #> geometry type: MULTIPOLYGON #> dimension: XY #> bbox: xmin: -73.41544 ymin: -55.25 xmax: 75.15803 ymax: 42.68825 #> CRS: +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 #> name iso2 iso3 geometry #> 0 Afghanistan AF AFG MULTIPOLYGON (((61.21082 35... #> 1 Angola AO AGO MULTIPOLYGON (((16.32653 -5... #> 2 Albania AL ALB MULTIPOLYGON (((20.59025 41... #> 3 United Arab Emirates AE ARE MULTIPOLYGON (((51.57952 24... #> 4 Argentina AR ARG MULTIPOLYGON (((-65.5 -55.2... #> 5 Armenia AM ARM MULTIPOLYGON (((43.58275 41...
df <- map %>% left_join( covid19_vaccine %>% filter(date == max(date), is.na(province_state)) %>% mutate(perc = round(100 * people_fully_vaccinated / population, 2)) %>% select(country_region, iso2, iso3, people_fully_vaccinated, perc, continent_name), by = c("iso2", "iso3") ) class(df) #> [1] "sf" "data.frame"
After we merged the country data with the corresponding geometry data it is straightforward to plot the data as sf object.
The mapview package, a wrapper for the leaflet library, enables to plot sf objects seamlessly. Let’s start by plotting the percentage of the population that fully vaccinated by country using the perc variable:
df %>% mapview::mapview(zcol = "perc")
By default, the function is using continues color scale for the objects (in this case countries) color. We can modify it and set color buckets by using the at argument. Also, we can define the legend title with the use of the layer.name argument:
By default, the function is using continues color scale for the objects (in this case countries) color. We can modify it and set color buckets by using the at argument. Also, we can define the legend title with the use of the layer.name argument:
df %>% mapview::mapview(zcol = "perc", at = seq(0, max(df$perc, na.rm = TRUE), 10), legend = TRUE, layer.name = "Fully Vaccinated %")
Some of the missing values in the plot are un-populated areas such as Antarctica or a territory that is count under different state such as Greenland. We can remove those and re-plot the map:
=======Some of the missing values in the plot are un-populated areas such as Antarctica or a territory that is count under different state such as Greenland. We can remove those and re-plot the map:
>>>>>>> devdf1 <- df %>% filter(!name %in% c("Greenland", "Antarctica")) df1 %>% mapview::mapview(zcol = "perc", at = seq(0, max(df1$perc, na.rm = TRUE), 10), legend = TRUE, layer.name = "Fully Vaccinated %")
Last but not least, you can customize the color palette with the use of the col.regions argument. In the following example, we will use the plasma function from the viridis package:
Last but not least, you can customize the color palette with the use of the col.regions argument. In the following example, we will use the plasma function from the viridis package:
The tmap (thematic maps) package is another useful tool for creating choropleth maps. Similarly to the mapview package, the tmap package supports sf objects. This tmap follow the ggplot2 package syntax style (e.g., using the + symbol to add to the plot additional plot). We will use the tm_shape function to define the input object with the df1 object, and add the tm_polygons function to customize the plot:
tm_shape(df1) + tm_polygons(col = "perc", n = 8, title = "Fully Vaccinated %", palette = "Blues")

The projection argument enables to set the map projections method using PROJ.4 code:
tm_shape(df1) + tm_polygons(col = "perc", n = 8, projection = 3857, title = "Fully Vaccinated %", palette = "Blues")

We can use the continent_name column to filter and plot the data by specific continent. For example, let’s plot the countries in South America:
df2 <- df1 %>% filter(continent_name == "South America") tm_shape(df2) + tm_polygons(col = "perc", n = 5, title = "Perc. Group", palette = "Blues")

Let’s customize the plot and use the tm_stype function to set background and the tm_layout function to add title:
tm_shape(df2) + tm_polygons(col = "perc", n = 5, title = "Perc. Group", palette = "Blues") + tm_style("cobalt") + tm_text("iso3", size = 0.7) + tm_layout( title= "% of Population Fully Vaccinated", title.position = c('right', 'top') , inner.margins = c(0.02, .02, .1, .25))

You can map the output interactive by setting the tmap_mode function to a view mode:
tmap_mode("view") tm_shape(df2) + tm_polygons(col = "perc", n = 5, title = "Perc. Group", palette = "Blues")
Last but not least, we can add facets with the use of the tm_facets and tmap_options functions:
Last but not least, we can add facets with the use of the tm_facets and tmap_options functions:
tmap_mode("plot") tm_shape(df2) + tm_polygons(col = "perc", n = 5, title = "Perc. Group", palette = "Greens") + tmap_options(limits = c(facets.view = 13)) + tm_facets(by = "name")

The geom_sf function enables to read and plot sf objects with ggplot2, for example:
ggplot(data = df2, aes(fill = `perc`)) + geom_sf() + scale_fill_viridis_b()

The scale_fill_viridis_b define the color setting. By default, the y and x axises are representing the earth coordinates. We can remove those coordinates with the theme_void function customize the color palette with the scale_fill_viridis function:
ggplot(data = df2, aes(fill = `perc`)) + geom_sf(size = 0.1) + scale_fill_viridis(alpha = 0.9, begin = 0.01, discrete = FALSE, end = 0.9) + geom_sf_label(aes(label = name)) + labs(fill = "Percentage", title = "'% of Population Fully Vaccinated", caption = "Source: Johns Hopkins University Center for Systems Science and Engineering") + theme_void()

geom_sf function - https://ggplot2.tidyverse.org/reference/ggsf.html